Frozen Sentences Of Portuguese: Formal Descriptions For NLP

Authors

  • Jorge Baptista
  • Anabela Correia
  • Graca Fernandes
Abstract

This paper presents ongoing research on building an electronic dictionary of frozen sentences of European Portuguese. It focuses on the problems that arise in describing their formal variation with a view to natural language processing.

Similar resources

A Thematic Connectionist Approach to Portuguese Language Processing

In the symbolic approach to Natural Language Processing (NLP), a system can only parse grammatically well constructed sentences. Within such a context, several linguistic phenomena, e.g. the thematic pattern relationships between the sentence constituents, can be accounted for (these pattern relationships are explained by a rule-based linguistic theory called thematic theory [1]). An alternativ...

Experiments in identifying frozen sentences

This paper describes an experiment on the identification of frozen sentences (or verbal idioms) from European Portuguese in a large corpus of journalistic text. It aims at identifying the main difficulties (or shortcomings) resulting from the intersection of linguistic information encoded in the lexicon-grammar with finite-state transducers that are then applied to texts. The paper shows that, for...

Some Experiments on Clustering Similar Sentences of Texts in Portuguese

Identifying similar text passages plays an important role in many applications in NLP, such as paraphrase generation, automatic summarization, etc. This paper presents some experiments on detecting and clustering similar sentences of texts in Brazilian Portuguese. We propose an evaluation framework based on an incremental and unsupervised clustering method which is combined with statistical simi...

Automatic Alignment of Common Information in Comparable Sentences of Portuguese

The ability to recognize distinct word sequences that convey the same meaning is of extreme relevance for many applications in NLP, such as automatic summarization, question answering, generation, etc. In this paper we describe our first attempt at aligning common information between similar Portuguese sentences. We propose a method based on lexical and syntactic information and some paraphra...

Extraction of Definitions in Portuguese: An Imbalanced Data Set Problem

Definition extraction is an important task in the NLP and IR fields, in the context of e.g. question answering, ontology learning, and dictionary and glossary construction. When addressed with learning algorithms, it turns out to be a challenging task because of the structure of the data set: definition-bearing sentences are far fewer than sentences that are non-definitions. I...


Publication date: 2004